in_node_exporter_metrics: add support for thermal_zone. #7522
Here is a run using valgrind:
valgrind ./bin/fluent-bit -i node_exporter_metrics -p metrics=thermal_zone -o stdout -m '*' -o exit -m '*' -f 1
==515970== Memcheck, a memory error detector
==515970== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==515970== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==515970== Command: ./bin/fluent-bit -i node_exporter_metrics -p metrics=thermal_zone -o stdout -m * -o exit -m * -f 1
==515970==
Fluent Bit v2.1.5
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2023/06/06 19:46:51] [ info] [fluent bit] version=2.1.5, commit=3fdd42c6f2, pid=515970
[2023/06/06 19:46:51] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/06/06 19:46:51] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.procfs = /proc
[2023/06/06 19:46:51] [ info] [cmetrics] version=0.6.1
[2023/06/06 19:46:51] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] path.sysfs = /sys
[2023/06/06 19:46:51] [ info] [ctraces ] version=0.3.1
[2023/06/06 19:46:51] [ info] [output:stdout:stdout.0] worker #0 started
[2023/06/06 19:46:51] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] initializing
[2023/06/06 19:46:51] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] storage_strategy='memory' (memory only)
[2023/06/06 19:46:51] [ info] [input:node_exporter_metrics:node_exporter_metrics.0] thread instance initialized
[2023/06/06 19:46:51] [ info] [sp] stream processor started
[2023/06/06 19:46:56] [error] [/home/pwhelan/Projects/personal/fluent-bit/plugins/in_node_exporter_metrics/ne_utils.c:117 errno=61] No data available
2023-06-06T23:46:56.265368405Z node_thermal_zone_temp{zone="0",type="acpitz"} = 16.800000000000001
2023-06-06T23:46:56.265368405Z node_thermal_zone_temp{zone="1",type="acpitz"} = 16.800000000000001
2023-06-06T23:46:56.265368405Z node_thermal_zone_temp{zone="2",type="acpitz"} = 16.800000000000001
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="0",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="1",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="10",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="11",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="12",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="13",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="14",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="15",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="16",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="17",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="18",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="19",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="2",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="20",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="21",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="22",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="23",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="3",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="4",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="5",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="6",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="7",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="8",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_cur_state{name="9",type="Processor"} = 0
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="0",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="1",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="10",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="11",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="12",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="13",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="14",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="15",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="16",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="17",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="18",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="19",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="2",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="20",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="21",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="22",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="23",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="3",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="4",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="5",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="6",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="7",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="8",type="Processor"} = 10
2023-06-06T23:46:56.306184135Z node_cooling_device_max_state{name="9",type="Processor"} = 10
^C[2023/06/06 19:46:59] [engine] caught signal (SIGINT)
[2023/06/06 19:46:59] [ warn] [engine] service will shutdown in max 5 seconds
[2023/06/06 19:47:00] [ info] [engine] service has stopped (0 pending tasks)
[2023/06/06 19:47:00] [ warn] [input:node_exporter_metrics:node_exporter_metrics.0] Unknown metrics: thermal_zone
[2023/06/06 19:47:00] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/06/06 19:47:00] [ info] [output:stdout:stdout.0] thread worker #0 stopped
==515970==
==515970== HEAP SUMMARY:
==515970== in use at exit: 0 bytes in 0 blocks
==515970== total heap usage: 3,802 allocs, 3,802 frees, 2,640,178 bytes allocated
==515970==
==515970== All heap blocks were freed -- no leaks are possible
==515970==
==515970== For lists of detected and suppressed errors, rerun with: -s
==515970== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
There are no memory leaks as far as I can ascertain.
Looks good to me. On my local box, this PR works as expected.
@pwhelan @leonardo-albertovich is this ready to go?
I just added the checks for the calls to flb_sds_cat_safe. I kept the variable names the same since they follow the conventions used throughout the
@edsiper As far as I'm concerned it's ready to go.
I don't have much insight into the join_a and join_b logic, but I used the same implementation for https://github.com/fluent/fluent-bit/pull/7522/files#diff-f32396faf19457b78c990f276acfbb6dde174a2ba49f99197ce827d312659641R212-R216.
This could prevent duplicating a sysfs mount point.
I checked the PR for Intel and AMD(Ryzen) platforms and it works well.
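The concern about duplicating the sysfs mount point can be illustrated with a small sketch. This is not the join_a/join_b code from the PR; `join_under_mount` is a hypothetical helper, written under the assumption that a path may or may not already begin with the configured mount point (such as the `path.sysfs = /sys` value shown in the log).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Fluent Bit): write into `out` a path
 * located under `mount`, without repeating the mount point when `path`
 * already starts with it. Returns 0 on success, -1 if `out` is too small.
 */
static int join_under_mount(const char *mount, const char *path,
                            char *out, size_t size)
{
    size_t mlen;
    int n;

    assert(mount != NULL && path != NULL && out != NULL);
    mlen = strlen(mount);

    /* The prefix only counts when it ends on a path component boundary,
     * so "/system" is not treated as being under "/sys". */
    if (strncmp(path, mount, mlen) == 0 &&
        (path[mlen] == '/' || path[mlen] == '\0')) {
        n = snprintf(out, size, "%s", path);
    }
    else {
        /* Skip a leading slash so the result has a single separator. */
        n = snprintf(out, size, "%s/%s", mount,
                     path[0] == '/' ? path + 1 : path);
    }

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

With this check, both `class/thermal` and `/sys/class/thermal` resolve to the same location under `/sys` instead of producing a doubled prefix.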
@leonardo-albertovich I'd be happy to open a new PR that renames *** edit *** I already pushed changes to rename both of them as just
3a332c5 to e945bd1
Signed-off-by: Phillip Whelan <[email protected]>
…ne gauges. Signed-off-by: Phillip Whelan <[email protected]>
…al_zone. Signed-off-by: Phillip Whelan <[email protected]>
Signed-off-by: Phillip Whelan <[email protected]>
Signed-off-by: Phillip Whelan <[email protected]>
…oks possibly redundant with comments. Signed-off-by: Phillip Whelan <[email protected]>
…tamp. Signed-off-by: Phillip Whelan <[email protected]>
Signed-off-by: Phillip Whelan <[email protected]>
Signed-off-by: Phillip Whelan <[email protected]>
…m rebase. Signed-off-by: Phillip Whelan <[email protected]>
…thermal_zone. Signed-off-by: Phillip Whelan <[email protected]>
eee3b8a to f9191af
This could be adapted to a more pluggable structure. @edsiper Are there any missing pieces before this can be merged?
@pwhelan Any chance you could add the corresponding documentation for this PR?
I already adapted to @nokute78's new pluggable structure. I added a documentation PR: fluent/fluent-bit-docs#1254. There's not much for individual plugins in the documentation for the node_exporter_metrics plugin.
Related to fluent/fluent-bit#7522. Signed-off-by: Phillip Whelan <[email protected]>
thanks everyone, merging this now. note: I will squash the commits since they all belong to the new thermal_zone collector functionality
* in_node_exporter_metrics: add a reference to thermal_zone.
  Related to fluent/fluent-bit#7522.
  Signed-off-by: Phillip Whelan <[email protected]>
* in_node_exporter_metrics: update interval property for thermal_zone.
  Add description for the thermal_zone interval configuration property.
  Signed-off-by: Phillip Whelan <[email protected]>
---------
Signed-off-by: Phillip Whelan <[email protected]>
Signed-off-by: Pat <[email protected]>
Co-authored-by: Pat <[email protected]>
Summary
This patch adds support for reading temperature values from /sys/class/thermal/thermal_zone*. The labels replicate the behaviour of the Prometheus node_exporter. This plugin provides thermal reporting on linux/arm64, especially for the Raspberry Pi 4 and other similar SBCs.
These sensors are already implemented in the Prometheus node_exporter: https://github.com/prometheus/node_exporter/blob/ed1b8e3d88851806627e4f8262ee26232ca56c2c/collector/thermal_zone_linux.go#L31.
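As a rough illustration of what such a collector has to do, here is a minimal sketch in C. It assumes the standard Linux sysfs layout, in which each zone directory (for example thermal_zone0 under /sys/class/thermal) exposes a `temp` file holding millidegrees Celsius. The helper names `millidegrees_to_celsius` and `zone_label` are hypothetical and do not come from the plugin source.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert the text of a sysfs `temp` file
 * (an integer in millidegrees Celsius) to degrees Celsius.
 * Returns 0 on success, -1 if the text does not start with a valid integer.
 */
static int millidegrees_to_celsius(const char *text, double *celsius)
{
    char *end = NULL;
    long raw;

    assert(text != NULL && celsius != NULL);

    errno = 0;
    raw = strtol(text, &end, 10);
    if (errno != 0 || end == text) {
        return -1;
    }

    *celsius = (double) raw / 1000.0;
    return 0;
}

/*
 * Hypothetical helper: derive the `zone` label from a directory name
 * such as "thermal_zone0". Returns a pointer into `dirname`, or NULL
 * when the name does not have the expected prefix followed by a suffix.
 */
static const char *zone_label(const char *dirname)
{
    static const char prefix[] = "thermal_zone";
    size_t len = sizeof(prefix) - 1;

    assert(dirname != NULL);

    if (strncmp(dirname, prefix, len) != 0 || dirname[len] == '\0') {
        return NULL;
    }
    return dirname + len;
}
```

Under these assumptions, a raw reading of 16800 converts to 16.8 degrees, and the directory name thermal_zone0 yields the label "0", matching the shape of the node_thermal_zone_temp samples in the valgrind run.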
Enter [N/A] in the box, if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.